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Q-e Abstract 

While the Quasi-Monte Carlo method of numerical integration achieves 
smaller integration error than standard Monte Carlo, its use in particle physics 
phenomenology has been hindered by the abscence of a reliable way to es- 
timate that error. The standard Monte Carlo error estimator relies on the 
assumption that the points are generated independently of each other and, 
OO . therefore, fails to account for the error improvement advertised by the Qua- 

si-Monte Carlo method. We advocate the construction of an estimator of 
stochastic nature, based on the ensemble of pointsets with a particular dis- 
. crepancy value. We investigate the consequences of this choice and give 

some first empirical results on the suggested estimators. 

a: 

èu. 1 Monte Carlo and Quasi-Monte Carlo 



> 
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1.1 Introduction 



- T— I | 

^ ■ In numerical integration, the main problem is not to obtain a numerical answer 3 

for the integrai, but rather, on the one hand, to ensure that the inherent numer- 
ical error is as small as possible, and, on the other hand, to estimate this error 
as precisely as possible. For integrands with well-known smoothness properties, 
a-priori estimates of the numerical error are possible, but for most practical ap- 
plications the smoothness properties of the integrand can only be investigated in 
the course of the integration itself, that is, by repeated numerical evaluation of the 
integrand. 

In this paper, we shall be concerned with the integration errors arising in Mon- 
te Carlo and Quasi-Monte Carlo integration. In these methods, the integration 
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nodes are distributed in a (more or less) stochastic manner, and the integration 
errors are therefore of an essentially probabilistic nature. The difference between 
Monte Carlo and Quasi-Monte Carlo is that in the former, the integration points 
are iid 4 unitomi in the integration region 5 , while in the latter the integration points 
are not chosen independently, but rather with an explicit interdependence so that 
their overall distribution is 'smoother', in a sense discussed below. 

In stochastic integration methods of the Monte Carlo or Quasi-Monte Carlo 
types, the integration error is itself an estimate, which contains its own error. That 
this is not an academic point becomes clear when we realize that the error esti- 
mate is routinely used to provide confidence levels for the integrai estimate (be 
it based either on Chebyshev or Central-Limit-Theorem, Gaussian rules); and a 
mis-estimate of the integration error can lead to a serious under- or overestimate 
of the confidence level. As an example, suppose that the Central Limit Theorem 
is applicable, so that the integration result is drawn from a Gaussian distribution 
centered around the true integrai value. One standard deviation, as estimated by 
Monte Carlo, corresponds to a two-sided confidence level of 68%. If the error 
estimate is off by 50% (admittedly a large value), the actual confidence level may 
then be anything between 38% and 87%. 

From this consideration, we are therefore led to a hierarchy of error estimates: 
the first-order error is that on the integrai estimate, while the second-order error 
is the error on the error estimate. This in turn has, of course, its own third-order 
error, and so on. Higher orders than the second one, however, appear to be too 
academic for practical relevance, but we should like to argue that, in any serious 
integration problem, the second-order error ought to be included. In what follows 
we shall discuss the first- and second-order error estimates. 

Due to the absence of a Quasi-Monte Carlo error estimator, users of Quasi- 
-Monte Carlo have been estimating the integration error with the classical Mon- 
te Carlo formula, as if the point set was iid. This systematically overestimates 
the error in any case where the quasi point-set is of any worth. Moreover, no 
confidence levels can be assigned since the classical estimator does not average to 
the error made by the quasi, non-iid point-sequence. The purpose of this paper is 

4 iid stands for 'independent, identically distributed' . 

5 This ignores the possible interpretation of stratifìed and importance sampling methods of vari- 
ance reduction. These can, at any rate, always be formulated in terms of methods using iid uniform 
integration points. 
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to investigate possible estimators for Quasi-Monte Carlo integration taking under 
consideration the non-iid nature of the underlying point-set 6 . 



1.2 Monte Carlo estimators 

In this section we briefly review the probabilistic theory underlying Monte Car- 
lo integration. This is of course well known, but we include it here so that the 
significant difference with Quasi-Monte Carlo can become clear. 

Throughout this paper we shall consider integration problems over the d-di- 
mensional unit hypercube C = [0, 1 ) d . The integrand is a function f (x), which we 
shall assume real and non-negative, and, of course, integrable over C. We shall 
defìne 



J, 



f(x) m d d x , m= 1,2,3,... , (1) 



c 



so that Ji is the required integrai. Note that J m is not necessarily finite for m > 2. 
In Monte Carlo we assume N integration points, to be chosen iid from the uniform 
probability distribution over C. This means that the point set X = {x*i , Xz, . . . , Xn} 
on which the integration is based is assunteci to be a typical member of an ensem- 
ble of such point sets, in such a way that the combined probability distribution of 
the N points over this ensemble is the uniform iid one: 

Pn(x*i,x 2 , . . . ,x N ) = 1 . (2) 

We shall take the averages over this ensemble. Other assumptions on the under- 
lying ensemble from which the point set X is believed to be chosen are possible, 
leading to a different form of Pn. In this, the situation is not different from that 
encountered in statistical mechanics. The above assumption, however, is the one 
that is always made in regular Monte Carlo and is justified to some extent by the 
fact that good-quality (pseudo)random number generators are actually available, 
allowing us to build ensembles of point sets X that indeed have the above property 
©. 



6 The opposite direction - re-introducing randomness by reshuffling the points of the Quasi- 
-Monte Carlo sequence in a way that preserves their uniformity properties, thus allowing for the 
use of a 'classical'-type estimator - has been studied extensively in the literature (see |9| and 
references therein). Such point-sequences behave better than Monte Carlo sequences and, for 
integrands with certain properties, as good as Quasi-Monte Carlo sequences. Estimating the error, 
however, requires the use of a number r of different reshuffiings of a point-set with n points, 
thereby trading off accuracy for knowledge of the error. 
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Let us assume that a point set X has been generateci, and the values of the 
integrane! f (x) at ali these points have been computed. These we shall denote by 
fj = f (x*j), j = 1 , 2, . . . , N. From these we can compute the discrete analogues of 
the integrals J m , which are computable in linear time (that is, time proportional to 
N): 

N 

S m = ^(fj) m . (3) 

The Monte Carlo estimate of the integrai is then 

E, = l Sl . (4) 
The expected value of Eì over the above ensemble of point sets is given by 

f(x)d d x = Ji , (5) 



N 

which is indeed the required integrai: this is the basis for the Monte Carlo method. 
Its usefulness appears if we compute the variance of Eì : 

a (E,) 2 = <Ef> - <E,> 2 = ± (J 2 - Jf) . (6) 

Since this decreases as N -1 , the Monte Carlo method actually converges for large 
N. Note that the leading, O(N ), terms of (E 2 ) and (Eì) 2 cancel against each 
other: this is a regular phenomenon in variance estimates of this kind 7 . The vari- 
ance cr (Eì ) is estimated by the fìrst-order error estimator (also called 'classical' 
or 'pseudo' estimator in what follows) 

E ' = ^-SsSf • < 7 > 

for which we have 

(E 2 > =0(EO 2 + O(N- 2 ) . (8) 

Since N is usually quite large, at least 10,000 or so, we feel justifìed in working 
only to leading order in N . The squared error of E 2 is computed to be, to leading 
order in N, 

o (E 2 ) 2 = ^(J 4 -4J 3 Jt - J| + 8J 2 jf - 4Jt) , (9) 



7 It should be pointed out that what we estimate is the average of the squared error, rather than 
the error itself, and squaring and averaging do not commute. In fact, this is another reason why 
the second-order estimate is relevant. 
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for which the estimator is 

E 4 = — (N 3 S4-4N 2 S 3 S! -N 2 S| + 8NS 2 S^-4S{) . (10) 
which can also be computed in linear time; we have 

(E 4 > = o-(E 2 ) 2 + 0(N- 4 ) . (11) 

Some details on the computation of leading-order expectation values of this type, 
as well as (for purposes of illustration) the form of the third- and fourth-order er- 
rar estimators, E 8 and E 16 , respectively, are given in the Appendix. 

A final remark is in order here. The Central Limit theorem, which ensures that 
the error estimate can be used to derive Gaussian confidence levels, can also be 
inferred from the computation of the higher cumulants of the error distribution: 
we find for the skewness 

((E 1 -(E 1 )) 3 ) = Ì 1(J3-3J 2 J 1 +2J3) , (12) 
and the unnormalized kurtosis: 

((E 1 -(E 1 )) 4 )-3o(E 1 ) 2 = ^3 (j 4 -4J 3 Ji-3J 2 +12J 2 J 2 -6Jt) , (13) 

which indicate that the higher cumulants decrease faster than the variance with 
increasing N; we shall examine this later on for the case of Quasi-Monte Carlo. 

1.3 Quasi-Monte Carlo estimators 

1.3.1 Multi-point distribution and correlation functions 

In contrast to the case of regular Monte Carlo, the technique of Quasi-Monte Carlo 
relies on point sets in which the points are not chosen iid from the uniform dis- 
tribution, but rather interdependently. To make this more specific, let us consider 
a point set X of N points. For each such a point set, we may define a measure of 
non-uniformity, called a discrepancy or, as in this paper, a diaphony. Its precise 
definition is presented below: for now, suffice it to demand that there exist a func- 
tion D(X) of the point set, which increases with its non-uniformity: D(X) = 
if the point set is perfectly uniform in ali possible respects, an ideal situation that 
can never be obtained for any finite point set. The Quasi-Monte Carlo method 
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consists of using point sets X for which D(X) has some value s which is (very 
much) smaller than (s), the value that may be expected for truly iid uniform ones. 

Given that such 'quasi-random' point sets can be obtained, how does one use 
them in numerical integration? The obvious issue here is to determine of what 
ensemble the quasi-random point set X can be considered to be a 'typical' mem- 
ber. In this paper, we should like to advocate the viewpoint that, since the main 
additional property of the quasi-random point set that distinguishes it from truly 
random point sets is its 'anomalously small' discrepancy D, the ensemble ought to 
consist of those point sets that are iid uniformly, with the additional constraint that 
the discrepancy D has the particular value D(X) = s for the actually used point 
set 8 . On this premise, the Quasi-Monte Carlo analogue of Eq.© would then be 
the assumption 

P n (s;x 1 ,x 2 ,... ) xn) = -— 6(D(X)-s) , (14) 

H(s) 

where s is, again, the observed value of the discrepancy of X, on which must 
now of course depend; and H(s) is the probability density to happen upon a point 
sets X with this discrepancy in the regular-Monte Carlo ensemble: 



Hfsì 



5(D(X) - s) d d X! d d x 2 d d x N (15) 



The actual computation of H(s) for given defìnition of the discrepancy is referred 
to the next section. What interests us here is the fact that P N is now no longer 
simply unity, since that would imply independence of the points in the point set. 
Let us therefore write the multi-pont distribution as 

Pn(s;xi,x 2) • • • ,Xn) = 1 - r7F 2 (s;Xi,X2, • • • ,Xn) , (16) 
where we have anticipated a factor 1 /N in the multi-point correlation F. 



1.3.2 Properties of the correlation function 

Since the value of the discrepancy of a given point-set X, should be independent 
of the order in which the points are generated, F k (s;x*i . . . x k ) must be totally 

8 We do not examine the possible alternative that the point sets in the ensemble must have 
discrepancy in the neìghborhood of the observed value s; this amounts to the distinction between 
the micro-canonical and the canonical ensemble in statistical mechanics. 
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symmetric; moreover, we must have 



Fic(s;xi,x 2 , ...,x K ) 



Fk+^sjX!,^, ...,x k ,x k+1 ) d d x k+1 , (17) 



which is not as trivial as it might seem since the value of the discrepancy, s, is 
based on the full N points and not on the smaller set of k or k + 1 points. Finally, 
for the Quasi-Monte Carlo integrai to be unbiased, we must have 

Pi(s;x 1 ) = 1 , (18) 

so that 

F 2 (s;x 1 ,x 2 ) d d x 2 = . (19) 

These remain, of course, to be proven and we shall do so in the next section, 
for a particular choice of discrepancy. Moreover, we shall show there that the 
multi-point correlation F N is, to leading order in 1/N, made up from two-point 
correlations F 2 : 

F k (s;x 1 ,x 2) . . . ,x k ) = Y_ F 2(s;x m ,x n ) . (20) 

l<m<n<k 

This establishes the properties of our ensemble of point sets X on which, in our 
view, the Quasi-Monte Carlo estimates ought to be based. 

1.3.3 Estimators 

We shall indicate the 'Quasi-Monte Carlo' nature of the estimators by the super- 
script ( q ) . The first estimator is that of the integrai: 

Here, and in the rest of this section, the sums will run from 1 to N . Denoting by 
the subscript ( q ) averages with respect to the 'quasi-random' ensemble discussed 
above, we then have 



Ll '(q) 



f(x)P 1 (s;x)d d x = J 1 , (22) 
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as before: owing to the fact that the one-point distribution is uniform, the Qua- 
si-Monte Carlo estimate is indeed as unbiased as the Monte Carlo one. The dis- 
tinction between the two methods appears in the fìrst-order error estimate. Let us 
define 

a(xi,X|) = 1 +F 2 (s;Xi,Xj) ; (23) 

then, we have 



a (E! 



(q)\ z = j_ 



h 



fif 2 ai; 



+ °{w. 



where we have adopted the straightforward convention for integrals 



f(Xi) f(X2) (X(Xi,X2) d d Xi d d X*2 , 



(24) 



(25) 



etcetera. As before, we shall insouciantly neglect terms that are sub-leading in 
1 /N . The advantage of the Quasi-Monte Carlo method is now clear: if we can 
ensure that otu > 1 'where it counts', that is, generally, when x^ and x*2 are 'close' 
in some sense, then the Quasi-Monte Carlo error will be smaller than the Monte 
Carlo one. A good Quasi-Monte Carlo point set, therefore, is one in which the 
points 'repel' each other to some extent. 

The fìrst-order error estimate is now simply 



N 

It is simple to show that, indeed 

:(q)\ 

2 Aq) " 

(q) 



(26) 



F, t! ■; o-fi;," ) ^ • OIK -i ; (27) 

however, evaluating E 2 4J is less trivial since it is not obvious how to do this in 
time linear in N . V 
can be evaluated to 



time linear in N. We shall discuss this later. The variance of the estimator E 2 q 



N 

+4 



fffjOij 



fifkfi<Xik<Xki + 4 



fifkflOikCXu 



fifjfkfi^ij^jkOCki +0{N 



-4ì 



(28) 
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for which the corresponding estimator (to leading order) is 

^ = ^(N 3 ^ft-4N 2 ^f?f j cx ij -N 2 ^f?ffa ij 
+4N Y_ f ? f kf i<Xik<Xki + 4N Y_ fff kfiOucOii 
-4 y ftfjfkfi<Xij<Xjk»ki) • (29) 

The details of this computation are discussed in the Appendix. It goes without 
saying that the substitution <x^ — > 1 will reduce ali the Quasi-Monte Carlo results 
to the regular Monte Carlo ones. 

We can now see why the 'classical' estimator Eq.© overestimates the error. 
Under the quasi distribution P2 of Eq. dTóT) the classical estimator averages to 




i(x)f(y)Hx,y) + O(^) (30) 



1 



The term involving the correlator is suppressed by A, which shows that E 2 aver- 
ages to something different than the variance of Et under the quasi distribution. 
Moreover, we will show in section 1231 that 9 the integrai of the suppressed term 
is strictly positive for any point-set that is better than a truly random one. So E2 
omits a strictly negative term when estimating the error. 

While it is true that the estimator Eq. (l26T) averages to a quantity whose leading 

order in N is equal to the leading order of (E2 1 ') > ^ suffers from the following 
disagreeable property: for a Constant integrand, while the first two terms vanish 
identically, the third approaches zero asymptotically from negative values. This 
leads to a negative squared error for ali practical purposes. Although this is not 
disastrous per se, it indicates the reason for the appearance of negative errors also 
for non-constant integrands, as will become apparent once we have a concrete 
expression for the correlation function. It is, thus, desirable to obtain an estimator 
that vanishes identically for Constant functions. This is achieved by 

= ^ L f ? - N^£ f i f j - m* £ WiJ " Fi * k " Fu + Fyd (31) 

i i,j k,l 



9 under fairly general conditions for the function f (x) 
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where the èy... denotes a sum with ali indices different, and Fy = F 2 (s;x\, Xj). 
This quantity averages to 

E 2 q2) ) = ^( J2 - li) - ^ dxdy dzdw f (x)f (y ) [F x>y - F x , w - F z>y + F Z ,J 

(32) 

which equals the leading part of (^2^) thanks to Eq.(fT9t. It is easy to check 
that the estimator of Eq. (l3TT) vanishes identically for a Constant integrand and any 
N, thanks to the antisymmetry property of the quadruple sum. 

1.3.4 Cumulants of Et 

As a final remark, we may also investigate the cumulants of the Quasi-Monte Car- 
lo estimator Ei. We write the expansion of the correlation function F k over 1 /N 
as 

F k (s; X! , . . . , x k ) = Fi 1 ' + 1f< 2) + ^Ff + . . . (33) 

and define 

M[^ >h = [ffà)* 1 • ■ . W^^s;*,. . . ,x k ) (34) 
It is evident that if Eq.dzH holds, we have 

]r 2 M§ (35) 

The cumulants are defined as 

c n= ((^-(^),J) (36) 



(q) 



(q) 



The variance of Ei is then 



'2 = ^(J2-j?-aO + o(^; 



(37) 



The skewness is 



c 3 = ^{j3-3J 1 J 2 +2jf-37WVj+3J 1 7W^+6J 1 A^^-A^^ il )+0(^) (38) 
The unnormalized kurtosis is 
c 4 -3ci = ^(-A^i 2 ^,! -3(A^i 1 , 1 ) ) 2 + 4J 1 A^S 2 1 \ 1 -ejfA^S^) + 0(^3) (39) 
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The above results indicate that a correlation function that satisfies the property of 
Eq.d2"0T) leads to a distribution whose skewness decreases faster with N than does 
the variance, but when it comes to the kurtosis (and higher cumulants), additional 
properties regarding the next-to-leading order expression for F (denoted above 
by Aiu ik ) are needed to secure Gaussian cumulants 10 . These properties hold 
whenever the saddle point approximation of Eq. d66H67t is valid. In such cases one 
expects Gaussian confidence levels for the Quasi-Monte Carlo estimator F_i . 



2 Multi-point distributions with diaphonies 
2.1 Diaphony 

We consider a point set X with N elements, given in C. The non-uniformity of the 
point set X can be described by its diaphony 11 : 



with 



N 



D(X) = -^P(x j) x lc ) , 



(40) 



e-a(x) = exp(2Ì7r n • x) . (41) 
, Tid) forni the integer lattice, and the hat de- 



Here, the vectors ri = (ri! , ri2, . 
notes the sum over ali ri except ri = 0. We may also write 

2 



D(X) 



1 

N 



N 

L 



SfvlXj 



(42) 



so that we recognize the diaphony as a measure of how well the various Fourier 
modes are integrated by the point set X. The diaphony is therefore seen to be 
related to the 'spectral test', well-known in the fìeld of random-number generator 
testing. For the mode strengths we have 



4>o , £4 = 1 



(43) 



Approach to a Gaussian distribution,for iid random variables, would require c n /(c2) n/ to 
approach for large N . 

"some of the concepts of this section have also been discussed in and |3 1. 
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The latter convention simply establishes the overall normalization of D. The ad- 
vantage of this diaphony over, say, the usuai (star)discrepancy is the fact that it is 
translation-invariant: 

&{x h x k ) = P(xj-xic) , (44) 

so that point sets X and X' that differ only by a translation (modulo 1) have the 
sanie non-uniformity: the diaphony is actually defined on the hyper-torus rather 
than on the hypercube. Also, the diaphony is tadpole-free: 

(3(x) d d x = . (45) 

c 

Moreover, we shall use <ji such that ai = <ji, if the two lattice vectors n and ri' 
differ only by a permutation of their components. Thus, X and X' will also have 
the sanie non-uniformity if they differ by a global permutation of the coordinates 
of the points. 

2.1.1 Some numerical results 

In this section the behavior of a specifìc diaphony is presented, for three point 
sequences, as the number of points N increases. 
The diaphony is defined by Eq. (l42l with 

o- = Ke~ Aft2 K~ 1 = Y_ e ~ Aft2 A = 0.1 (46) 

The reason for experimenting with this definition lies in the factorizing property 
of the o>t. Due to K 1 being related to Jacobi theta functions, we cali this the 
'Jacobi diaphony'. We will be using this diaphony in most of what follows. 

In this paper we will be using three point sequences that we will be calling 
Ranlux, Van Der Corput and Niederreiter. Ranlux is a pseudo- 
random point sequence generated by the Ranlux algorithm (see [4|) with luxury 
level equal to 3. Van Der Corput is a quasi-random sequence generated by an 
implementation of the algorithm by Halton that generalizes to many dimensions 
an older algorithm by Van der Corput (see 0) with prime bases 2, 3, 5, 7, 1 1 , . . .. 
Finally Niederreiter is another, optimal 12 quasi-random sequence based on 
the algorithm in (see [7 1). In particular, we follow the choices of [8 1 and construct 
the sequence in whichever base is optimal for the current dimension (see Q). 

12 in a sense described in 1 7 1 and 1 8 1 . 
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It should also be noted that only modes with square length up to n 2 < 15 are 
included in the calculation of the diaphony (including the determination of K), in 
anticipation of the sanie restriction on the estimator sums in later sections. 

In the plots that follow, the diaphony of the Niederreiter sequence in 
particular, but also that of the van der Corput sequence, exhibited a large 
variation in relatively small intervals of N . As the number of points N approaches 
certain criticai values the diaphony reaches very small levels, only to return to 
its 'cruising' values a few points later. To avoid cluttering the plots we present 
here the diaphony averaged in packs of 500 points without information on the 
minimum or maximum value found in each pack. The minimum values for each 
pack, that correspond to exceptional point configurations, are very interesting on 
their own but do not affect the present study. 



ari der Corput 
Niederreiter 



Figure 1: D=2 (left) and D=3 (right). The diaphony of RANLUX (red line), Van 
Der Corput (green line) and Niederreiter (blue line). 



The diaphony of the RANLUX sequence is seen to oscillate around 1, as ex- 
pected. Moreover the behavior of the Niederreiter sequence improves with 
the number of dimensions when compared with crude Van der Corput, an 
encouraging hint for higher dimensions. 



2.2 Generating function 

We shall now compute a 1 /N approximation to the moment-generating distribu- 
tion of the p-point probability distribution, that is, 

G p (z) = (exp(zD(X))V ^ ^ , (47) 

\ ' x p + , ,x p+2 ,...,x N 
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Figure 2: D=4 (left) and D=5 (righi). The diaphony of RANLUX (red line), Van 
Der Corput (green line) and Niederreiter (blue line). 

where we have indicated that the points x*i , Xa, . . . , x p are kept fixed while the 
remaining N — p points are integrated over. G p (z) therefore stili depends on 
x*i, . . . , Xp. This is most easily achieved using a diagrammatic approach, which 
has been introduced in Q. We shall indicate with crosses those points that are 
kept fixed (with an implied sum over them, from 1 to p), and with dots ('beads') 
those points that are integrated over (again, with an implied sum running from 
p + 1 to N). The function |3 is indicated by a solid line. As the simplest examples, 
then, we have 



1 1 N 

ifp = N: -x x = -^(3(x j -x k )=D(X) , 



and 



ifp=0: (D(X)U U ...^ =p(0)=Q = 1 
Other examples are 



(48) 



(49) 







(3(x! -x 2 ) 2 d d X! d d x 2 , 

(3 (xi — x 2 ) (3 (x 2 — x 3 ) (3 (x 3 — xi ) d d X! d d x 2 d d x 3 



(50) 



and so on: a general closed loop with precisely n beads will be denoted by (n). 
Note that, since, the functions eft(x) forni an orthonormal (and even a complete) 
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set, we have 

We can now simply write out ali possible (connected and disconnected) diagrams 
where every solid line ends in a cross or a bead, and apply the following Feynman 
rules: 

1. A factor 2z/N for every (3 line (where the factor 2 arises from the two 
possible orientations); 

2. A factor (N — p)- for every diagram (or product of diagrams) that contains 
precisely q beads 13 ; 

3. In addition, the usuai symmetry factors arising from equivalent lines and 
vertices, and from the repetition of identical (sub)diagrams. 

We shall compute G p (z) including terms of order 1 and those of order 1 /N. Note 
that 

(N -p)S = N< (l - ^ - + 0(N- 2 ) (52) 

as long as N 3> pq, q 2 . In the following we shall always assume this. 

First, we consider contributions without any crosses or nontrivial vertices. A 
general term in this class is given by 

where 

Q=r 1 +2r 2 + 3r 3 + --- ; (53) 

up to order 1/N 2 , this contribution to the generating function can therefore be 
written as 



pz 3 


z 2 3 2 


N" 3z 


2N 3z 2 


pz 3 


z 2 3 2 


N~3z" 


2N 3z 2 



1 ^-~)G^(z) . 
G [0] {z) = exp^-l£log(l-2zo- 2 i )j . (54) 



13 The 'falling power' is defined as a^= a!/(a-b)! = a(a- 1)(a-2) • • • (a-b + V 
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Up to 1/N 2 , one diagram with a four-point vertex may be present: a generic 
contribution of this type is 



( N _ p ) Q+^+1+^+1 / (2 z )m 1+ m 2 +2 m 1 f\f\ 



m. 



r 2 1 /4z 3 



l'3 



where mi t 2 denote the number of beads on each loop, excluding the one on the 
four- vertex. Let us define 



2zo 2 



(55) 



then, this contribution can be written as 

1 



GÌ 2) U) = — (J)(z;0) 2 G(°'(z) 



8N 



(56) 



Note that the lemniscate graph is actually equal to the product of two closed loops: 
this is a consequence of the translational invariance of the diaphony. A generic 
contribution containing two three-vertices is 



(]SJ _ p j Q+mi +Tti2 +m 3 +2 
]\| Q+mi +m 2 +m 3 +3 



(2z 



mi +rri2 +m3 +3 
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V 




x- 1 - [ z 
ri! 



r, ! 



T3 



r 2 ! V ^ J r 3 ! \ 3 
so that this contribution to the generating function reads 



G > ) = 72N G( ° )(Z 



cj)(z;x) 3 d d x 



(57) 



The diagrams with crosses have the generic contribution 



]\] Q+m+1 



z(2z) m X « « » « X 
x. m x, 
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^ 1 /4z 3 



l'3 



leading to 



G [ v 4] {z) = 2nG (0) (z) ^^(zjXj-xO 



j,k=1 



V i<j<i<<p / 



(58) 



where we have singled out the contributions with j = k. Ali other possible di- 
agrams either vanish because of translational invariance and tadpole-freedom, or 
are of order 1 /N 2 or lower. The final result for the generating function up to order 
1 /N 2 is therefore 



/ 



G P (z) 



G to )( 



1 4N 



<\>{z\x] 2 d d x + 



1 



12N 



cb(z;x) 3 d d x 



(59) 



i<j<k<p 



Note that the terni in G ^ (z) containing p cancels precisely against that in G ' 4 ' (z) , 
so that the only reference to p is in the last terni in brackets in Eq. (l59l . and indeed 
we have 

' G p (z) d d x p = G p _ 1 (z) . (60) 



In Appendix B we give the result for the higher order (Of^)) terni in G p . 
There are 25 terms that contribute but only three of them include p. The condition 
Esistili holds. 



2.3 Multi-point distribution by Laplace transform 

From the generating function, we can recover the actual probability distributions. 
As discussed above, let H(s) be the probability that the point set X has diaphony 
equal to s, that is, D(X) = s. The underlying ensemble of point sets is that of sets 
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of N iid uniformly distributed points, Le. the same ensemble underlying the usuai 
Monte Carlo error estimates. Then, we have 



Hfsì 



d d X! d d x 2 ---d a x N 5(D(X) -s] 



2l7T 



e- zs G (z) dz , 



(61) 



where the integration contour runs to the left of ali the singularities of Go(z); 
and the multi-point distribution for p points averaged over ali point sets X with 
diaphony s, is given by 



Pp (s, X] , X 2 , • • • > Xp) 
Rp(s, X"i , X 2 , • • • i Xp) 



1 



h( s ; 
1 

2Ì7T 



-Rp(s, X-] , X2, . . . , Xp) , 



e~ zs Gp(z) dz 



(62) 



Since we write the deviation from uniformity of the multi-point distribution as 



P p (s;xi,x 2 , . . . ,x p ) = 1 - ^rFp(s;x 1 ,x 2 , 



(63) 



we see that the multi-point correlation F p is, up to ), as claimed, built up from 
two-point correlatore 14 : forp > 3, 

p-1 

Fp(s;x 1 ,x 2) . . . ,x p ) = F p _i(s;xi,x 2 , . . .,Xp-i) + ^ F 2 (s;xj, x p ) , (64) 

3=1 

so that the p-point correlator is simply the sum of ali p(p — 1 )/2 2-point cor- 
relators. In the approximation used, the sub-leading terms in H(s) are actually 
irrelevant, and we may write 



H(s) 



1 



2Ì7T 



exp(t|>(s;z)) dz , 



14 This doesn't hold for the next order in i as seen in Appendix B. Terms like the one of 



Eq.( ll09> . that don't factorize, appear for p > 3. 
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1 

Ms;z) = -sz - ~Y_ log (l - 2zff|) , 



2' 

n 



F 2 (s;x 1 ,x 2 ) = 



2mH(s] 



exp(ib(s;z))cb(z;xi — x 2 ) dz . (65) 



Except in the very simplest cases 15 , a complete evaluation of Eq.(l631) is nontrivial. 
A simplification arises if s is much smaller than its expectation value 1 (which 
is anyway the aim in quasi-Monte Carlo), or if the Gaussian limit is applicabile, 
namely when the number of modes with non-negligible a\ becomes large in such 
a way that no single mode dominates. In practice, this happens when the dimen- 
sionality of C becomes large. Fortunately, these are precisely the situations of 
interest. The position of the saddle point for H(s), £, is given by 



n 



For s 1 , therefore, £ is large and negative. Since to first order the same saddle 
point may be used for R 2 , we find the attractive result 

- 2£ff^. 
F 2 (s;xi,x 2 ) « } iva e*(%i) e fl (x 2 ) , aia = — 2 n . (67) 
^— 2zai - 1 



n 



The formulae d66l) and (1671) suffice, in our approximation, to compute ali the multi- 
point correlations. 



We finish this section with the following observation. Suppose that F 2 is given 
as a function of x*i , x 2 . By Fourier integration we can then compute the w^. 
The assumption that the saddle-point approximation is valid, together with the 
normalization condition Y_ G \ — 1 » men allows us to write 

n n 

We see that F 2 not only determines the /orm of the diaphony, but in addition also 
its value. 

15 See sectionPHl 
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3 Application of Quasi-Monte Carlo estimators 



3.1 The mechanism behind error reduction 

After the above preliminaries we can now examine the mechanism by which Qua- 
si-Monte Carlo can outdo Monte Carlo. We shall assume the saddle-point approx- 
imation to be valid. For s < 1 , we then have 2 < 0, and ali the cuft are positive, 
and as £ — > —co they approach unity from below (although for |ft| — > oo they 
must always, of course, go to zero). Now notice that the set of functions ea[x) is 
complete, that is, 

Y_ e*(x0 e rt (x 2 ) = 6 d (xì - x 2 ) . (69) 

n 

This allows us to write the variance of the Monte Carlo error as 

2 



i 



f(x) e ft (x) d d x 



(70) 



where the contribution from the zero mode n = is canceled by the }] term. For 
Quasi-Monte Carlo on the other hand, we fìnd 



1 

N 



Le 



tu, 



f(x) e fl (x) d d x 



(71) 



We see that those modes n for which cu a. is positive tend to lead to an error re- 
duction. In the saddle-point approximation, therefore, any value < s < 1 will 
lead to a decreased error with respect to standard Monte Carlo. On the other hand, 
since 



< £ < min 



1 



,m, , for s > 1 , 



(72) 



large values of the diaphony will actually lead to an increase in the error. Note that 
in the above we have only used the fact that the form a complete, orthonormal 
set of functions: therefore, the error-reduction result holds for a much wider class 
of discrepancies than just the diaphonies discussed in this paper. 
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3.2 Estimators analyzed 

We can now arrive at an estimator for the Quasi-Monte Carlo errar. The simplest 
form is obtained by inserting Eq.dFTl) in the equation for E 2 (Eql 



with 

— 

1 - 2£oi 



-, 

n 



n 



constraints of Eq.d43t. Our choice is the so called Jacobi weights 



We are stili free to choose the exact form of the weights cr^ at will, under the 

16 



a\ = Ke-^ 2 (75) 

with 

K- 1 = Y e~^ 2 (76) 

* — n 

The parameter À controls the 'sensitivity' of the diaphony: as À — » we get 
o>t — > 1 for every mode which corresponds to a super-sensitive diaphony, useless 
for practical purposes, while as À — » oo only the modes with n 2 = 1 contribute 
making the diaphony fairly non-sensitive. We choose À = 0.1 . Other values of À, 
within a 'reasonable range' do not alter, in practice, the numerical value of E^', 
as shown in section lOl 

It is easy to see that the estimator averages (to leading order in N) in a positive 
definite quantity 17 . This leaves stili open the possibility for a negative errar esti- 
mate, particularly for relatively smooth functions where the cancellation between 
the two sums of the pseudo estimate are large leading to a small errar. The source 
of the negative errar effect is clear in the case of a Constant function. Then 

f(x) = C => E^ = -^C 2 ^ cun^UftCxOuftfo) (77) 

n i,j 

and the point sum of every Fourier mode can be anything frorn (when the points 
are spread evenly enough to produce complete cancellations for ali the included 
modes) to N 2 (when ali the points are on top of each other). The average of this 

16 due to their convenient factorizing property. 

17 It averages to Ea. (l71> which is positive definite as long as s < 1 . 
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sum is N (for truly random points), but for Quasi-Monte Carlo points we expect 
that this sum will be signifìcantly smaller than that. For non-constant functions 
similar effects can be expected, apart from the fact that the first two terms of E 2 q ' 
do not cancel anymore. Thus, we expect negative squared errors for higher modes 
or small number of points, and this is what has been observed in a number of plots. 
Unfortunately there is no way to predict precisely when, as N increases, the esti- 
mator gets a useful, positive value. One could resort to the error of E 2 , but that 
is cubie in the number of modes (see Eql29l and hence prohibitively expensive in 
realistic calculations. 

The way out of this is the estimator of Eq .(BTl) which can be written in a form 
with unrestricted sums as follows: 

n 

n n 

~mi T. w ^ - 2 + i u -i 2 k s 2 - s?) (78) 

where 

Uft= Y TJ-tv(Xì) 
i 

Wft = Y_ Uft(xi)f (xt) 

i 

Qtt = Y un(xi)f 2 (xi) 

i 

Si =^ fi (79) 

i 

It is identically zero for a Constant function, as can be easily checked, and av- 
erages to the leading order of the squared variance of E1 . The correction terms are 
of higher than leading order in N , but that does not mean that we have selectively 
included some NLO corrections to the variance. The correction terms above are 
such that the NLO terms vanish on the average. 
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In practice the infinite sum over modes in both estimators has to be truncated. 
This should not be perceived as an approximation of any kind. It amounts to 
a redefinition of the diaphony. Looking at Eq. dóóT) we see that as the value of 
s becomes small the saddle point becomes quickly large and negative: 2 <C 0. 
Then —2£<s\ — > oo for low modes and — 2£cr^ — » for higher modes, when 
Ofl/|2| — > 0. We can, thus, safely neglect these higher modes in the estimator. As 
long as the value of the diaphony is small, which is in any case the goal in Quasi- 
-Monte Carlo , the profile of cu a. depends only on the choice of À, which, as said, 
also regulates the sensitivity of the diaphony. We see therefore that the estimator 
inherits the sensitivity of the diaphony in a direct way. 

It is worth noting that the factorized forni of the |3 -function in the diaphony 
definition is directly responsible for the fact that the two estimators are now of 
complexity N x M (with M the number of modes) instead of quadratic in N. 
This is a desirable achievement as long as M < N, which we shall always assume 
to be the case. 



3.3 Numerical results 

In the following we will present a number of plots that show how both the 'clas- 
sical' and the quasi error estimates 18 behave as a function of the number of points 
N. In the process we will use the three types of point sequences defined in section 

uni 

A number of test functions were used for integrands. They consist of a subset 
of the test functions used by Schlier in @, along with a Gaussian function with 
dimension-dependent width. We have 

TF13:f(x) = n |4Xk ~ 2 ' + k (80) 
ti 1+k 

which averages to Ji = 1 . This test function is especially tailored for a Van der 
Corput sequence, since in D = 1 it is perfectly integrated by such a sequence with 
base 2. 

D 

TF2 : f [x] = Y[ kcos(kx k ) (81) 

lc=1 

18 The 'classical' or 'pseudo' estimator, E2, is the one of Eq.0, constructed on the assumption 
that the points are iid. By 'quasi' estimator, E 2 q , we mean the 'improved' estimator Ea.(f78t. 
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which averages to Jt = n k sin(k). This function should be difficult to integrate 
in high dimensions. 



D k 

TF4:f(x) = ^f];^ (82) 

k=1 j=1 

which averages to Ji = 1 — ^d-. It is chosen as a simple example of a function that 
is not a product of single-variable functions. 

A Gaussian with fixed width suffers from a rapid decrease, in higher dimen- 
sions, of the region of the integration volume where the function is non-zero, 
making the integration cumbersome (the higher the dimension, the more points 
are needed and inter-dimensional comparison is difficult). To avoid this we use 
instead 



D oo __i Y; _Y„ I -i- rL i1 2 



'-/lo 1 



TK:f(x)=nL ^ < 83 > 

i=1 n i= -oo V 

which is a product of superpositions of a Gaussian and its tails outside the [0, 1] 
interval. We wish to keep the variance of this function independent of the number 
of dimensions, so we define cr such that 

1 oo 

' y e --w = (1+v) i/D^ (84) 

za z — 

TU— OC 

where in practice it suffices to keep the first couple of terms in the sum. The 
function averages to Ji =1 and spreads as the number of dimension grows (a — > 
oo as D — ) oo). 

In the following plots the error and its estimates as functions of the number of 
points N are shown in a doublé logarithmic scale. 

The 'classical' error estimate is presented, along with three versions of quasi 
error estimators, E^ 5 , E^ 10 , E^ 15 . The superscript next to q denotes the squared 
length of the highest modes included in the sums of Eq. (l78l . Thus E^ 10 includes 19 
modes with ri 2 < 10. In table [T] we give the number of modes with n 2 < 15, 
and u < 5 for different dimensions. It is evident that the number of modes grows 
rapidly with the dimensionality. 

19 please note that the square length of a mode is the sum of the squares of D inte- 
gers. So for D = 2, for example, the modes present are those with square equal to 
1,2,4,5,9,10,13,16,17,... and, thus, E^ 15 ' actually contains modes with squared length up 
to 13. 
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D 


# of modes 


D 


# of modes 


1 


6 


1 


4 


2 


44 


2 


20 


3 


250 


3 


56 


4 


1256 


4 


136 


5 


5182 


5 


332 



Table 1 : number of modes with n 2 < 1 5 (left) and n 2 < 5 (right) 



The real error made is included for comparison. The data were collected in 
a point per point basis up to N = 1 5 . In the plots we have included the av- 
erage value of each error for successive subsets of 500 points, suppressing any 
information on minimum or maximum values in the subset 20 . 

Ali integrations are performed in the unit hypercube [0, 1] D . The dimension- 
ality varies from 2 to 6. 




Figure 3: TF2, d=2 log-plot using a Van der Corput sequence (left) and a 
Niederreiter sequence (right). The classical error estimator is far off the real 
error whereas the quasi estimators are approaching the real error as more modes 
are added to the sum. The need for more modes is, however, obvious, in both 
plots. 



20 The real error (in particular) fluctuates a lot as the quasi sets complete their successive cycles 
of low diaphony, but knowledge of the specific point where the error minimizes is of course not 
available a priori. 
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D=3 TF13 points:vdc 1=0.1 max=15 



D=3 TF13 poinls:niederreiter 1=0.1 max=15 




Figure 4: TF13, d=3 log-plot using a Van der Corput sequence (left) and a 
Niederreiter sequence (right). The quasi estimators follow the error with the 
appropriate N-dependence contrary to the pseudo estimator. Note that the E q14 
is in this case worse than E ql ° or E q5 for ali N < '\00000. The higher modes 
converge slower to their average value, but the cross-over point is not known in 
advance and it is function-dependent. 



D=4TF6 poin1s:vdc 1=0.1 



D=4 TF6 pointsmiedsrteiler l=l 





Figure 5: TF6, d=4 log-plot using a Van der Corput sequence (left) and a 
Niederreiter sequence (right). The quasi estimators approximate well the 
error. Moreover we see here a clearer instance of the crossover of higher modes 
in large N mentioned in the previous figure. 
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D=5 TF6 poin1s:vdc 1=0.1 max=15 



D=5 TF6 pointsmiederreiter 1=0.1 max=1 5 




Figure 6: TF6, d=5 log-plot using a Van der Corput sequence (left) and 
a Niederreiter sequence (right). The use of the improved estimator (Eql78l) 
reduces the probability of a negative error square estimate but, naturally, it doesn't 
remove it altogether. The plot on the right demonstrates this effect. As expected, 
the estimator returns to positive values and stabilizes as the number of points 
increases and the estimator converges to its average value. 



D=6TF6 poirits:vdc 1=0.1 max=15 D=6 TF6 points: niederreiter 1=0.1 max=15 




Figure 7: TF6, d=6 log-plot using a Van der Corput sequence (left) and a 
Niederreiter sequence (right). In this case the estimators describe very well 
the real error made in the integration. 
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4 Alternative approaches 



4.1 Raising the value of À in the Jacobi diaphony 

In general the real Quasi-Monte Carlo error is approached by including more and 
more modes in the estimator sum. At the same time, by including higher modes, 
one increases the error on this estimate (the error on E2) because one attempts 
to estimate by Monte Carlo means the integrai Jf(x)eft(x) which will fìuctuate 
vigorously for higher modes. 

One might then attempt to raise the value of À, thus decreasing the number 
of active modes (that give an appreciably non-zero C0ft)). This would of course 
reduce the sensitivity of the diaphony, artifìcially lowering its value. Improve- 
ment in the error estimate originating from higher modes would be lost but the 
contribution of the modes close to the origin (which are the ones included) would 
be relatively enhanced, as can be seen from the behavior of the weights co a (see 



D-3 Van derCorput 




D=3 Niederreiter 




Figure 8: TF6, d=4 log-plot using the Van der Corput (left) and the 
Niederreiter (right) sequences. E q15 is shown for different values of À in- 
dicated in the key, along with the real error and the classical estimate. Average 
values of ali quantities for sets of 500 points are shown in each case. The value 
of À doesn't alter the estimator, as long as that value stays within a specifìc range. 
We see that, in this case, the value À = 1 .6 is out of the safe range. 
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D=3 TF13 poinls:vdc 1=0.1 max=15 



D=5 TF6 points:vdc 1=0.1 max=15 



10000 20000 30000 40000 50000 60000 70000 80000 90000 100000 



10000 20000 30000 40000 50000 60000 70000 80000 90000 100000 



Figure 9: The ratio of different quasi estimators with the classical estimator for 
d = 3 and TF13 (left) and d = 5 TF6 (righi). In both cases Van der Corput 
point-sets were used. 



4.2 Monitoraci estimator 

The estimators F^ 5 , F^, 10 and F^ 15 are not always proportional to the classical 
estimate, and, in some cases they decreases quite faster with N than the classical 
estimate does. They never decrease slower than the classical estimate, though, and 
one can use that as follows. One monitors the ratio of F2 5 , for example, to the 
'classical' error estimate, and after a certain point 21 , the 'classical' error is only 
estimated and multiplied with that ratio. This is a purely linear algori thm and 
therefore very fast. Caution has to be exercised, though, in the way the criticai 
ration is chosen, in order to avoid confìgurations where the estimators acquire a 
very low value for some exceptional value of N . 

This approach relies heavily on the, frequently false, assumption that the quasi 
and classical estimators scale. If this is not so, the new estimate is conservative. 
One has, thus, the option to trade accuracy for cpu time. 

The plots of fìg|5]show the ratio of F^ 5 , F^ 10 and F^ 15 with F2 for two partic- 
ular cases. 

21 which depends on the resources of the user. 
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4.3 The box approximation 

There is a choice for the diaphony that allows us to perform the integrals of Eq.d631) 
without resorting to the saddle point approximation. That choice is 

ff * = M Et e(^<m) (85) 

H=1 ..d 

for some arbitrary m. The normalization (EqH3l determines M = (2m+ 1 ) D — 1 . 

This diaphony includes only a finite number of modes, ali of which are equally 
weighted. It can be seen as an approximation to the Jacobi diaphony since for 
small À the latter gives o"^ « 1 for |ri| < n c and ~ for \H\ > n c where n c is 
determined implicitly by the value of the Jacobi diaphony. The diaphony can be 
evaluated as a quadratic function on the point- set from 

fi i |ft|<m i i,j 

(86) 

with 



The distribution of point-sets with a particular value for s is then found by explic- 
itly performing the z- integrals of Eq. (l65l) : 

t/K„K-1 

H(s) = -p7^^ Ks (88) 
with K = M/2. Hence the correlation function is 

F(s; m - x,-) = - xj) (89) 



and the estimator 22 of Eq.d73l becomes 

N 2Z_'^ N 3 ^— N 3 M 



22 The use of the improved estimator of eq^U in the box approximation is prohibited by the 
quadruple sums that it would contain. 



30 



This form has the advantage of including ali modes up to an arbitrary m with- 
out much effort, with the overhead, of course, of being quadratic in N . As N 
grows beyond 1 5 this becomes particularly impractical. For investigating pur- 
poses, however, this approach is useful in testing the behavior of E 2 with more 
modes included (that is presumably the small À limit). 

It is remarkable that in the limit m^oowe have ib(x\ — Xj) = M-óy, and 
this leads to s = 1 

In that limit a good point-set would have to integrates well any mode using a finite 
number of points N. Since that is impossible, ali point-sets will be evaluated as 
equally bad by the particular diaphony. 

It is evident that one has to find an optimal value for m. In the follo wing plot 
the estimator E 2 q ' is shown for TF5 in 2 dimensions with different values for m 
ranging from 3 to 30. 
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Figure 10: TF6, d=2 log-plot of the real error, and then from top to bottom the 
classical estimate, E2 wmi tu = 2, with m = 3 and m = 5. The more modes 
one adds to the estimator the better it behaves. We also include the case m = 30 
(orange line), to demonstrate that there is a turning point in m above which the 
estimate becomes worse. Note that m = 5 means square length up to 2m 2 = 50, 
much higher than 15 that was our ceiling in the plots of the previous sections. 
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5 Concluding remarks 



• The use of Quasi-Monte Carlo point-sets in numerical integration achieves 
a smaller error than the use of pseudo-random Monte Carlo point-sets. This 
advantage cannot be put in use without a reliable method for estimating the 
integration error. 

• The 'classical', stochastic, error estimator relies on the assumption that the 
points in the point-set are uncorrelated. When used with a Quasi-Monte 
Carlo point-set, this assumption no longer holds. We saw that this leads to 
overestimating the error, thereby canceling any advantage gained by using 
the Quasi-Monte Carlo point-set. 

• An estimator of stochastic nature is stili possible but the underlying ensem- 
ble can not be the ensemble of ali point-sets. We advocate the use of the 
ensemble of point-sets with the same degree of uniformity, as measured by 
a chosen diaphony. This approach leads to a prescription for a correlation 
function and an estimator, without the use of any information on the partic- 
ular point-set or integrand. 

• The price to pay is the raise in the computational complexity of the estimator 
from linear to quadratic in the number of points, which reflects the inclusion 
in the estimator of correlations between pairs of points. Using properties of 
diaphonies one can revert to a complexity that is linear times the number of 
modes involved. 

• The error estimator suggested in this paper is shown to perform better than 
the 'classical' error estimator, resulting in an estimate up to an order of 
magnitude smaller than the 'classical' one. 

• The flexibility of the construction (reflected in the freedom to choose the 
precise diaphony and the number of modes included) allows one to trade ac- 
curacy for computational cost. In computationally expensive applications, 
the monitoring approach of section l4~2l could be used to obtain an estimate 
that lies somewhere between the 'classical' and the quasi regime. 

A number of further investigations have to be undertaken before implementing 
Quasi-Monte Carlo in the demanding fìeld of phase space integration in particle 
physics. We defer these and further testing of the error estimator suggested above, 
in realistic cases, to further work. 
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Appendix A: Estimators by diagrammatics 
Diagrammatics for Quasi-Monte Carlo and Monte Carlo 

Our strategy for obtaining the form of the estimators is best described by an ex- 
ample. Consider the triple sum 

N 

S P1 S P2 S P3 ^ Y. f i lf P f £ 3 • (92) 

i,j,k=l 
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In our approach we need to compute the expectation value of this object including 
the first sub-leading order in 1 /N. It is given by 



(S P1 S P2 S P3 ) = N- 
+N 2 - 



f r 1 f r f S 3 - n (?2(i ' j] + F 2(t,k) +F 2 (j,k)) 

f P,+P2 f P3 + f P,+P3 f P2 + fP, f P2+P 3 \ + 0(N j 



f p lf P if P3 _ N 2 



( f Pl+P2 f P3 +f Pl+P3fP2 +f P lf P2 



+P 3 \ 



(93) 



with implied integration over the subscripts. The sub-leading terms in the expecta- 
tion value are, therefore, obtained by either connecting any two of the summands 
in the multiple sum Ci with a factor —oc, or by contracting them. Now, any es- 
timator E consists of a linear combination of terms like the above. Its variance, 
(E 2 ) — (E) 2 , contains both leading and sub-leading terms. The leading terms, 
however, cancel completely, and so do the sub-leading terms coming from a con- 
nection/contraction inside one of the factors E. We arri ve at the following dia- 
grammatic prescription. A sum of powers of f will be represented by a labeled 
dot, and a connection (including the — a) by a link between dots. For example, 



N 



3 



1 4 2 



= L 

i,j,k,l= 



fffjf4f 2 a jk akl 



(94) 



Now, suppose that the estimator E is given as a linear combination of connected 
diagrams. The estimator of its variance is the given by the connected sub-leading 
diagrams that can be obtained from E x E. The factors 1 /N can be added in 
a straightforward manner: each sum with p different summing indices carries a 
factor N~ p , and there is an additional overall factor N in E 2 k. 



Estimators for Quasi-Monte Carlo 

We apply the above considerations to the first estimators E^ q j 4 for Quasi-Monte 
Carlo. Squaring and constructing the connected sub-leading diagrams, we find 



(q) _ • 

1 " 1 
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E< q) = !• • • m + • »+4» • 9+4 

4 1111 121 211 3 



+r? + ?- <95) 

Upon insertion of the correct factors of 1 /N, we arrive precisely at the estimators 
M 2 4 gi ven m this paper. The construction of Eg q ' is straightforward: at that 
order, tree diagrams with branches develop. It may be worth noting that in this 
diagrammatic approach it becomes immediately clear that no diagrams with loops 
(that is, occurrences of <x^, or oCyOCji, or ix^oc^oiki, and so on) are possible to this 
order in 1 /N . 

Estimators for Monte Carlo 

The MC estimators are of course precisely those of Quasi-Monte Carlo, with the 
replacement cx^ — > 1 . This means that the topology of the tree diagrams becomes 
irrelevant, and we can feasibly go up to E 16 . We find 

1 

E k = ^tL E ^ NS » K = 1, 2, 4,8,16 , (96) 

s=0 

where the coefficients of the various powers of N are given by 







Si , 


^2,0 




-S 2 
^1 » 


^2,1 




s 2 , 


^4,0 




-4S 4 , 


E 4 ,i 




8S 2 S 2 , 


U,2 




-S 2 -4S 1 S 3 , 


U,3 




s 4 , 


^8,0 




-256S? , 


E8,1 




1024S^S 2 , 


^8,2 




-1152S 4 S 2 -512S^S 3 , 


^8,3 




352S 2 S^ + 832S^S 2 S 3 + 224S 4 S 4 , 


^8,4 




-4S 4 2 - 224SìS 2 S 3 - 128S 2 S 2 - 208S 2 S 2 S 4 - 96S^S 5 


^8,5 




32S 2 S 2 + 8S 2 2 S 4 + 48S!S 3 S 4 + 48S^S2S 5 + 32S 2 S 6 , 


^8,6 




— S 4 — 8S 3 S5 — 4S 2 Sg — 8S1S7 , 



(97) 
(98) 



(99) 
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E 8) 7 = S 8 , (100) 

E 16 , = -4194304S] 6 , 

E 16 ,i = 33554432S] 4 S 2 , 

E 16>2 = -104857600S] 2 S^- 1677721 6S] 3 S 3 , 

E 16)3 = 162922496S] S 2 ; + 93585408S] 1 S 2 S 3 + 7733248S] 2 S 4 , 

E 16i4 = -132579328S*S 4 - 1895301 12S?SfS 3 - 201 85088S]°S 2 
-37552128S] S 2 S4-3538944S] 1 S 5 , 

E 16>5 = 54444032S^Sf+ 172064768S(S^S 3 + 69861 37éS^S 2 S^ + 63553536S^S^S 4 
+15532032S?S 3 S 4 + 14942208S?S 2 S 5 + 1507328S]°S 6 , 

E 16>6 = -9806848S 4 S 6 2 - 69660672S 5 ^S 4 2 S 3 - 77729792S\S\% - 451 9731 2SfSÌS 4 
-43855872S(S 2 S 3 S 4 - 5931 008S?S 3 S 5 - 21 1 35360S(S|S 5 
-622592S?S 7 - 5357568S*S 2 S 6 - 8060928S]Sl - 2802688S?S 2 , 

E 16 , 7 = 551 936S 2 S 2 '+1 41 80352S^ S 2 S^ + 1 0500096Sf SfS 3 + 320061 44S{S^S 2 
+1 281 6384StS^S 4 + 61 931 52S 6 ) S 2 S 2 4 + 36679680S^S 3 S 4 
+7016448S^S 2 S 4 + 13725696SfS 2 S 3 S 5 + 1 1 722752S^S^S 5 
+2007040S(S 4 S 5 + 6072320SfSÌS 6 + 1994752S(S 3 S 6 
+250880S^S 8 + 1 798144S(S 2 S 7 , 

E 16)8 = -256Sf- 643891 2SfS^S|- 381 9520S 2 S^S 2 -366592S 1 S|S 3 
-1 04601 6SfS^S 4 - 35681 28S 4 S 2 ; S 4 1 - 8730624S 4 S 2 S 2 S 4 
-2233344S 3 S 4 S 5 - 1807360S^S 3 S 2 - 9879552S?S 2 1 S 3 S 4 
-8638464StS|S 3 S 5 - 203571 2SfS|S 5 - 342016SfS| - 2471 936StS|S 6 
-3492864S^S 2 S 4 S 5 - 1 542144SfS|S 7 - 570368S^S 2 S 8 - 618496S^S 3 S 7 
-607232S^S 4 S 6 - 851 968S 4 S 4 - 96256S]S 9 - 3602432S^S 2 S 3 S 6 , 

E 16)9 = 542208S 1 S 2 -S 3 S 4 + 2359296S 2 S 2 S 2 S 4 + 1404160S 3 S 2 S 3 S 4 1 
+1 608704SfS|S 3 S 6 + 514432S 2 S^S 2 + 924672S 4 S 3 S 4 S 5 
+765952S 4 S 2 S 4 S 6 + 552960S 2 S 2 S 4 + 1405952S 2 S 2 4 S 3 S 5 
+1814528S^S 2 S^S 5 + 540672SìS^S| + 1394688S^S^S 4 S 5 
+262656SfS^S 6 + 808960S 4 S 2 S 3 S 7 + 90624SìSÌS 5 + 575488S 3 S 3 ; S 4 
+451584S 4 S 2 Sf + 191488S^S 5 S 6 + 475136S 4 S 2 S 6 + 164864S^S 4 S 7 
+42291 2SfSlS 7 + 1 79200S^S 3 S 8 + 343296S 4 S 2 S 8 + 1 67936S^S 2 S 9 
+1024SfS 4 + 33792StSi + 133760S 4 S 4 ' + 60416S|S| , 
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E 16)10 = -1 74336S 1 S 2 ! S 3 S 4 1 - 1 9]488S ì S 2 SlS 4 - 49920S?S 2 S 4 ; - 1 20832SiS|S 3 S 6 
-290304S?S 2 S 3 ! S 6 - ^83936S 2 Ì S 2 2 S 4 S 6 - 242688SìS^S|S 5 - 99328SìS|S 4 S5 
-87040S^S|S 5 - 231 936S^S 3 S 7 - 1 34400S^S 2 S 4 S 7 - 1 53728S^S 2 S 3 S 8 
-1 74080S^S 2 S 5 S 6 - 6451 2S 3 2 S 2 3 S 4 - 1 0521 6S 2 SlSj - 1 72288Sf S 3 S 4 S 6 
-291 84S^S 3 S 5 - 1 00352S?S|S 5 - 1 05472S? S 3 S^ - 111 360SfS^S| 
-901 12SfS|S 7 - 22528SìS4S 7 - 40000S?S 4 S 8 - 471 04StS 5 S 7 
-68608S 3 S 2 2 S 9 - 49088S?S|S 8 - 4761 6S?S 3 S 9 - 42240S?S 2 S 10 
-10752SfSn - ^52S 4 2 S 2 4 - 23552S 2 2 S 4 3 - 512SfS 6 - 23296SjS;? 
-456960S^S 2 S 3 S 4 S 5 - 16384StS| , 

E 16)11 = 3328S^S| + 20352S^S 3 S 4 S5 + 16576S 1 S 2 S^S 5 + 25088S 1 S^S 4 S 5 
+27136SiS 2 S 3 Sf + 4096S 4 3 S 4 + 33024S?S 3 S 5 S 6 + 26240S?S 2 S 3 S 9 
+32768SÌS2S2S7 + 1 7536SìS|S 4 S 7 + 51 84S 1 S 3 S 4 ' + 24064S?S 3 S 4 S 7 
+25856S^S 2 S 5 S 7 + 1 9456SìS|S 3 S 8 + 1 6448S?S 2 S 4 S 8 + 20224S 1 S^S 5 S 6 
+1 1392S 2 S|S| + 13440S^S 2 S^ + 1 1328SfS^S 6 + 10240S 2 S^S 5 
+-Ì6000S 2 S 4 S 2 + 224S^S 8 + 1 1520S|S§S 6 + 832S|S 4 S 6 + 13312S 7 S|S 6 
+1 3568S^S 8 + 9472S^S 6 S 7 + 691 2S^S 3 S 7 + 9984S^S 5 S 8 
+8832S?S 2 Sn + 10368S?S 3 S 1 o + 4224S 1 S 2 5 S 9 + 8576S?S 4 S 9 
+10176SfS^S 10 + 352S 2 l S 4 ' + 46336SÌS2S3S4S6 + 3008S|S 12 , 

E 1612 = -768S^S| - 2944S 2 S 3 S 5 S 6 - 1024S 3 S^S 5 -2208S 1 S 2 S 5 S 8 - 1664SìS 2 S 4 S 9 
-1216S 2 S 4 S| - 3072SiS 2 S 3 S 10 - 2944S 2 S 3 S 4 S 7 - 3584SìS 3 S 5 S 7 
-2944SÌS2S6S7 - 201 6SìS 3 S 4 S 8 - 3008SiS 4 S 5 S 6 - 768SfSf - 1024S|S 7 
-1536S^S 4 S 6 - 1472S^S 6 S 8 -224S 2 S^S 6 - 1920S!S 3 S|- 1024S!S^S 7 
-1472S 2 S|S 8 - 1408S^S 5 S 7 - 208S|S 4 S 8 - 960S 1 S^S 11 - 1792S?S 3 S n 
-1440S^S 2 S 12 -1312S?S 4 S 10 - 1 792S 1 S|S 9 - 1 088S|S 3 S 9 
-1 856S^S 5 S 9 - 768SiSl - 128S|S| - 704S^S 13 - AS 4 4 - 96S|S 10 , 

E 16>13 = 128S 2 Sf+128S^S 6 + 32S 4 S2 + 160S 3 S 5 S 8 + 48S 2 S 6 S 8 + 8S|S 8 
+256S 3 S 6 S 7 + 192S 4 S 5 S 7 + 128S^S 10 + 224S 1 S 5 S 10 + 160S 2 S 5 S 9 
+128S 3 S 4 S 9 + 1 92SiS 6 S 9 + 1 60SìS 7 S 8 + 224SìS 3 S 12 + 48S 2 S 4 S 10 
+32S|S 12 + 192S 2 S 3 Sn + 128S 1 S 4 S 11 + 160S 1 S 2 S 13 + 128SfS 14 , 

Ei6,i4 = — S g — I6S5S11 — I6S7S9 — 8SgSio 

— 4S 4 Si 2 — 16S 3 Si 3 — 16SiSi5 — 8S 2 Si 4 , 
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El6,15 — Sl6 



(101) 



The number of individuai terms in each E K is that of the partitions TT(K) of K: 
TT(1) = 1, TT(2) = 2, n(4) = 5, TT(8) = 22, and n(16) = 231. Likewise, the 
number of terms in each E K)S is the partition of K into (K — s) parts. We have not 
extended our results to the fìfth-order errar estimator with K = 32 and 17(32) = 
8349, since already E 8 and E 16 are purely academic and we have included them 
only as an illustration of the method. 



Appendix B: The O(^) contribution to G p 

The second order contribution to G p can be found by summing up O(^qj) terms 
coming from 

1. the pure rings (containing only 2-point vertices) 

2. the three graphs contributing to G^ 1 ' 2 ' 3 ' (containing one 4-vertex, two 3- 
vertices or two external points) 

3. products of a pure ring and one of the three graphs above or two of the 
graphs above. 

4. the new graphs (containing one 6-vertex,one 5-vertex and one 3-vertex, two 
4- vertices, one 4-vertex and two 3-vertices, four 3-vertices, one 3-vertex and 
three external points, two 3-vertices and two external points or one 4-vertex 
and two external points) 

After a lengthy but straightforward calculation (involving some cancellations) we 
get 

Gjf» =W (^ E ^K 3 -ì(K 2 ) 2 -ÌK 2 (x l> x J )) 
( ^K 5 + ^K 4 -^K 2 -lK 2 K 1 (x i ,x j ) 

-^sUjj ~ Jl 3>1>1 + ^K 1 (x i ,x j ) 2 + ^Qi(Xi,Xj,x k ) 
+Iq2{x u x ì ) + ^Q 3 (xi,Xj,x k ) + jQ 4 (xi,Xj) 
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1 1 1 2 

+-L 1)1)1 (x i ,Xj,x k J + — Li,i,iKi(xì,Xj) + 233 L i,i,i 

+ _L Mi + 1m 2 + 1m 3 + 1m 4 ) 



where 



K a ,b,...= Y- P " P 2--- (102) 
1,2,... 

/ 

K a (x i ,xj) = ^^pfe iil (x i )e^x,) (103) 

ij 1 

L Q)b>c = Y_ P"P2P35l+2+3 (104) 
1,2,3 

Qi{x h x h x k ) = YY PiP2 e ni(^)eft 1 (x j )e ft2 (x j )e* l2 (x k ) (105) 

i,j,k 1,2 

Q 2 (xì,xj) =^^p 1 p 2 e i t 1 (x i )eft ] (xj)et 2 (x i )e f t 2 (x i ) (106) 

i,j 1,2 

i 

Q3 (xì, Xj , x k ) = Y_ Y. pl p2e ^ e ftr ( x i ) e ^ (*j ) e * 2 e* 3 (xj ) (x k ) 

i,j,k 1,2,3 

(107) 

Q 4 (x i) x j ) = ^^pfp 1+2 e rtl (x i )e^(x j ) (108) 
tj 1,2 

La,b,c(x i ,x j ,x k ) = ^ ^ p^p^p§e rtl (xOefl^xjJe^xic) 81+2+3 (109) 

i,j,k 1,2,3 

Qa,b(xi,Xj,x k ) =^^p^e ftl (xOet (x^e^^x^et^Xk) (110) 

i,j,k 1,2 

Mi = ^ Plp2p3p45l+2+3+4 (IH) 
1,2,3,4 

M 2 = Y_ P1P2P3P4P551+2-5S3+4-1-2S5-3-4 (H2) 

1,2,3,4,5 

M 3 = ^ PlP2P3P4P5P65l-2-552-3-653-1-454+5+6 (H3) 
1,2,3,4,5,6 
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M4 = PlP2P3P4P5P65l-2-552-3-653+6-454-5-1 (H4) 

1,2,3,4,5,6 

with 

2zo£ 

» s T=2%r (115) 

and LlìX... = Lx^^...' Ll A ... = Ln,,n 2 ,... and Sl+8_2+... = 6(n! + u 8 - 

ri 2 +...)• 
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real error 



pseudo estimate 
estimate modes 1 
estimate modes 3 




1e-04 - 



1e-05 - 



1e-06 - 



1e-07 1 1 1 1 — : 1 1 1 1 1 1 1 1 — 

1000 10000 



